Using EASI Sentinel-1 RTC Gamma0 data¶
This notebook demonstrates how to load and use Sentinel-1 Radiometric Terrain Corrected (RTC) Gamma0 data generated in EASI.
The processing uses SNAP-10 with a Graph Processing Tool (GPT) XML recipe for RTC Gamma0 and its variants.
For most uses we recommend the smoothed 20 m product (sentinel1_grd_gamma0_20m).
We can process the 10 m products (sentinel1_grd_gamma0_10m, sentinel1_grd_gamma0_10m_unsmooth) on request. Please also ask if you wish to trial other combinations of the parameters.
RTC Gamma0 product variants¶
| | sentinel1_grd_gamma0_20m | sentinel1_grd_gamma0_10m | sentinel1_grd_gamma0_10m_unsmooth |
|---|---|---|---|
| **DEM** | | | |
| copernicus_dem_30 | Y | Y | Y |
| Scene to DEM extent multiplier | 3.0 | 3.0 | 3.0 |
| **SNAP operator** | | | |
| Apply-Orbit-File | Y | Y | Y |
| ThermalNoiseRemoval | Y | Y | Y |
| Remove-GRD-Border-Noise | Y | Y | Y |
| Calibration | Y | Y | Y |
| SetNoDataValue | Y | Y | Y |
| Terrain-Flattening | Y | Y | Y |
| Speckle-Filter | Y | Y | N |
| Multilook | Y | Y | N |
| Terrain-Correction | Y | Y | Y |
| **Output** | | | |
| Projection | WGS84, epsg:4326 | WGS84, epsg:4326 | WGS84, epsg:4326 |
| Pixel resolution | 20 m | 10 m | 10 m |
| Pixel alignment | PixelIsArea (top-left) | PixelIsArea (top-left) | PixelIsArea (top-left) |
Units and conversions¶
The sentinel1_grd_gamma0_* data are in intensity units. Intensity can be converted to dB or amplitude, and vice versa, with the following relations. Practical Xarray examples are given below.
Intensity to/from dB:
dB = 10*log10(intensity)
intensity = 10**(dB/10)
Intensity to/from Amplitude:
intensity = amplitude * amplitude
amplitude = sqrt(intensity)
In this notebook we define two such functions for xarray Datasets/DataArrays, using numpy.
def intensity_to_db(x):
return 10*numpy.log10(x)
def db_to_intensity(db):
    return numpy.power(10, db/10.0)
Reference: https://forum.step.esa.int/t/what-stage-of-processing-requires-the-linear-to-from-db-command
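The conversions above can be sanity-checked with a short standalone numpy sketch (the function names mirror those defined in this notebook):

```python
import numpy as np

def intensity_to_db(x):
    """Convert backscatter intensity to decibels."""
    return 10 * np.log10(x)

def db_to_intensity(db):
    """Convert decibels back to backscatter intensity."""
    return np.power(10, db / 10.0)

intensity = np.array([0.01, 0.1, 0.5])
db = intensity_to_db(intensity)      # [-20., -10., ~-3.01]
amplitude = np.sqrt(intensity)

# Round trips recover the original intensity values
assert np.allclose(db_to_intensity(db), intensity)
assert np.allclose(amplitude ** 2, intensity)
```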
# Basic plots
%matplotlib inline
# import matplotlib.pyplot as plt
# plt.rcParams['figure.figsize'] = [12, 8]
# Common imports and settings
import os, sys, re
from pathlib import Path
from IPython.display import Markdown
import pandas as pd
pd.set_option("display.max_rows", None)
import xarray as xr
import numpy as np
# Datacube
import datacube
from datacube.utils.rio import configure_s3_access
from datacube.utils import masking
import odc.geo.xr
from dea_tools.plotting import display_map
# EASI tools
import git
repo = git.Repo('.', search_parent_directories=True).working_tree_dir # This gets the current repo directory. Alternatively replace with the easi-notebooks repo path in your home directory
if repo not in sys.path: sys.path.append(repo)
from easi_tools import EasiDefaults, xarray_object_size
from easi_tools.notebook_utils import mostcommon_crs, initialize_dask, localcluster_dashboard, heading
# Data tools
import hvplot.xarray
# import geoviews
import cartopy.crs as ccrs
# Dask
import dask
from dask.distributed import Client, LocalCluster
EASI environment¶
This is for convenience only.
It allows this notebook to be used in any EASI deployment defined as part of the easi-notebooks repository.
Please substitute with your own values if adapting the notebook for your own work.
easi = EasiDefaults()
family = 'sentinel-1'
# product = this.product(family)
product = 'sentinel1_grd_gamma0_20m'
display(Markdown(f'Default {family} product for "{easi.name}": [{product}]({easi.explorer}/products/{product})'))
Successfully found configuration for deployment "csiro"
Default sentinel-1 product for "csiro": sentinel1_grd_gamma0_20m
Dask and ODC¶
# Dask local cluster
cluster = LocalCluster(n_workers=4)
client = Client(cluster)
server = f'https://hub.{easi.domain}' # Or replace if not using EasiDefaults
user = os.environ.get('JUPYTERHUB_SERVICE_PREFIX') # Current user
dask.config.set({"distributed.dashboard.link": f'{server}{user}' + "proxy/{port}/status"}) # port is evaluated by dask
display(client)
# Or use Dask Gateway - this may take a few minutes
# cluster, client = initialize_dask(use_gateway=True, workers=4)
# display(client)
# ODC
dc = datacube.Datacube()
configure_s3_access(aws_unsigned=False, requester_pays=True, client=client);
# List measurements for the product
dc.list_measurements().loc[[product]]
Client-4a7a1646-9cd2-11ef-8c01-a2a699b1fbb8: LocalCluster with 4 workers, 48 total threads, 24.00 GiB total memory.
Dashboard: https://hub.csiro.easi-eo.solutions/user/csiro-csiro-aad_pag064@csiro.au/proxy/8787/status
| product | measurement | name | dtype | units | nodata | aliases | add_offset | scale_factor | flags_definition | spectral_definition |
|---|---|---|---|---|---|---|---|---|---|---|
| sentinel1_grd_gamma0_20m | vh | vh | float32 | intensity | NaN | [gamma0_vh] | NaN | NaN | NaN | NaN |
| | vv | vv | float32 | intensity | NaN | [gamma0_vv] | NaN | NaN | NaN | NaN |
| | angle | angle | float32 | degrees | NaN | [local_incidence_angle, localincidenceangle] | NaN | NaN | NaN | NaN |
Choose an area of interest¶
# Set your own latitude / longitude
# Australia GWW
latitude = (-33, -32.6)
longitude = (120.5, 121)
time = ('2020-01-01', '2020-01-31')
# PNG
# latitude = (-4.26, -3.75)
# longitude = (144.03, 144.74)
# time = ('2020-01-01', '2020-05-31')
# Bangladesh
# latitude = (21.5, 23.5)
# longitude = (89, 90.5)
# time = ('2024-05-01', '2024-06-10')
# Vietnam
# epsg:32648
# latitude = (9.1, 9.9)
# longitude = (105.6, 106.4)
# time = ('2024-01-01', '2024-09-10')
display_map(longitude, latitude)
Load data¶
data = dc.load(
product = product,
latitude = latitude,
longitude = longitude,
time = time,
dask_chunks = {'latitude':2048, 'longitude':2048}, # Dask chunk size
group_by = 'solar_day', # Group by day method
)
display(xarray_object_size(data))
display(data)
'Dataset size: 286.14 MB'
<xarray.Dataset> Size: 300MB
Dimensions: (time: 5, latitude: 2000, longitude: 2500)
Coordinates:
* time (time) datetime64[ns] 40B 2020-01-02T21:18:08.500000 ... 202...
* latitude (latitude) float64 16kB -32.6 -32.6 -32.6 ... -33.0 -33.0 -33.0
* longitude (longitude) float64 20kB 120.5 120.5 120.5 ... 121.0 121.0
spatial_ref int32 4B 4326
Data variables:
vh (time, latitude, longitude) float32 100MB dask.array<chunksize=(1, 2000, 2048), meta=np.ndarray>
vv (time, latitude, longitude) float32 100MB dask.array<chunksize=(1, 2000, 2048), meta=np.ndarray>
angle (time, latitude, longitude) float32 100MB dask.array<chunksize=(1, 2000, 2048), meta=np.ndarray>
Attributes:
crs: EPSG:4326
grid_mapping: spatial_ref
Conversion and helper functions¶
# These functions use numpy, which should be satisfactory for most notebooks.
# Calculations for larger or more complex arrays may require Xarray's "ufunc" capability.
# https://docs.xarray.dev/en/stable/examples/apply_ufunc_vectorize_1d.html
#
# Apply numpy.log10 to the DataArray
# log10_data = xr.apply_ufunc(np.log10, data)
def intensity_to_db(x):
return 10*np.log10(x)
def db_to_intensity(db):
    return np.power(10, db/10.0)
def make_image(ds: 'xarray', frame_height=300, **kwargs):
"""Return a Holoviews object that can be displayed or combined"""
# TODO select spatial dim names from the given xarray
defaults = dict(
cmap="Greys_r",
x = 'longitude', y = 'latitude',
rasterize=True,
geo=True,
frame_height=frame_height,
)
defaults.update(**kwargs)
return ds.hvplot.image(**defaults)
def select_valid_time_layers(ds: 'xarray', percent: float):
    """Select time layers that have at least a given fraction of valid data (e.g., >= 0.05 for 5%)
    Example usage:
        selected = select_valid_time_layers(ds.vv, 0.05)
        filtered = ds.sel(time=selected)
    """
    # TODO select spatial dim names from the given xarray
    return ds.count(dim=['latitude','longitude']).values / (ds.sizes['latitude']*ds.sizes['longitude']) >= percent
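This filter can be exercised on synthetic data; a self-contained sketch (the layer shapes and 5% threshold here are illustrative only):

```python
import numpy as np
import xarray as xr

def select_valid_time_layers(da, percent):
    """Boolean mask of time layers with at least `percent` valid (non-NaN) pixels."""
    valid = da.count(dim=['latitude', 'longitude']).values
    total = da.sizes['latitude'] * da.sizes['longitude']
    return valid / total >= percent

# Synthetic stack: 3 time layers of 10x10; the middle layer is almost all NaN
arr = np.ones((3, 10, 10), dtype='float32')
arr[1] = np.nan
arr[1, 0, 0] = 0.5  # only 1% valid in layer 1
times = np.array(['2020-01-01', '2020-01-13', '2020-01-25'], dtype='datetime64[ns]')
da = xr.DataArray(arr, dims=['time', 'latitude', 'longitude'], coords={'time': times})

selected = select_valid_time_layers(da, 0.05)  # [True, False, True]
filtered = da.sel(time=selected)               # keeps 2 of the 3 layers
```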
# Optional time layer filter
# Select time layers with at least 5% of valid pixels
# selected = data.vv.count(dim=['latitude','longitude']).values / (data.sizes['latitude']*data.sizes['longitude']) >= 0.05
# data = data.sel(time=selected).persist()
db_data = intensity_to_db(data).persist()
Plot the data¶
# A single time layer for VV and VH, with linked axes
vvplot = make_image(data.vv.isel(time=1), clim=(0, 0.5), title=f'VV ({data.time.dt.strftime("%Y-%m-%d %H:%M:%S").values[1]})', clabel='Intensity')
vhplot = make_image(data.vh.isel(time=1), clim=(0, 0.1), title=f'VH ({data.time.dt.strftime("%Y-%m-%d %H:%M:%S").values[1]})', clabel='Intensity')
vvplot + vhplot
(Repeated RuntimeWarnings from numpy.log10 — "invalid value" and "divide by zero" — are expected here: nodata pixels are NaN and zero-intensity pixels map to -inf in dB.)
# Make a dB plot
vvplot = make_image(db_data.vv.isel(time=0), clim=(-30, -3), title=f'VV ({data.time.dt.strftime("%Y-%m-%d %H:%M:%S").values[0]})', clabel='dB')
vhplot = make_image(db_data.vh.isel(time=0), clim=(-30, -1), title=f'VH ({data.time.dt.strftime("%Y-%m-%d %H:%M:%S").values[0]})', clabel='dB')
vvplot + vhplot
# Subplots for each time layer for VV, with linked axes
make_image(db_data.vv, clim=(-30, -3)).layout().cols(4)
Make an RGB image¶
For an RGB visualization we use the ratio between VH and VV.
# Add the vh/vv band
db_data['vh_vv'] = db_data.vh / db_data.vv
# Scale the measurements by their median so they have a similar range for visualization
med = db_data / db_data.median(dim=['latitude','longitude'])
med = med.persist()
med
<xarray.Dataset> Size: 400MB
Dimensions: (time: 5, latitude: 2000, longitude: 2500)
Coordinates:
* time (time) datetime64[ns] 40B 2020-01-02T21:18:08.500000 ... 202...
* latitude (latitude) float64 16kB -32.6 -32.6 -32.6 ... -33.0 -33.0 -33.0
* longitude (longitude) float64 20kB 120.5 120.5 120.5 ... 121.0 121.0
spatial_ref int32 4B 4326
Data variables:
vh (time, latitude, longitude) float32 100MB dask.array<chunksize=(1, 2000, 2048), meta=np.ndarray>
vv (time, latitude, longitude) float32 100MB dask.array<chunksize=(1, 2000, 2048), meta=np.ndarray>
angle (time, latitude, longitude) float32 100MB dask.array<chunksize=(1, 2000, 2048), meta=np.ndarray>
vh_vv (time, latitude, longitude) float32 100MB dask.array<chunksize=(1, 2000, 2048), meta=np.ndarray># RGB plot using a DEA Tools function
# https://github.com/GeoscienceAustralia/dea-notebooks
# https://github.com/digitalearthafrica/deafrica-sandbox-notebooks/
sys.path.append('/home/jovyan/dea-notebooks/Tools')
from dea_tools.plotting import rgb
rgb(med, bands=['vh','vv','vh_vv'], col="time")
# Experimental - Use Holoviews
# Create an RGB array, and persist it on the dask cluster
rgb_ds = xr.concat([med.vv, med.vh, med.vh_vv], 'channel').rename('rgb').to_dataset().persist()
# Plot the RGB
rgb_plot = rgb_ds.hvplot.rgb(
bands='channel',
groupby='time', rasterize=True,
geo=True,
title='RGB', frame_height=500,
)
rgb_plot # + vv_plot + vh_plot
WARNING:param.GeoRGBPlot05006: Clipping input data to the valid range for RGB data ([0..1] for floats or [0..255] for integers).
Export to Geotiffs¶
Recall that writing a dask-backed dataset to file requires calling .compute() on it. This may cause a large memory increase on your JupyterLab node if the area of interest is large, which in turn may kill the kernel. If so, skip this step, choose a smaller area of interest, or find a different way to export the data.
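Before computing, the materialised size can be estimated from the array's `nbytes` (which is derived from shape and dtype, so it works for dask-backed arrays too). A minimal sketch with a synthetic array of the same shape as the example load above:

```python
import numpy as np
import xarray as xr

# Synthetic stand-in with the same shape/dtype as the loaded product
da = xr.DataArray(np.zeros((5, 2000, 2500), dtype='float32'),
                  dims=['time', 'latitude', 'longitude'])

# A full .compute() would materialise this much memory per variable
full_mib = da.nbytes / 2**20                  # ~95 MiB

# Computing one time layer at a time (as write_band below does)
# bounds the peak at a single slice
slice_mib = da.isel(time=0).nbytes / 2**20    # ~19 MiB
print(f'full: {full_mib:.1f} MiB, per slice: {slice_mib:.1f} MiB')
```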
# Make a directory to save outputs to
target = Path.home() / 'output'
if not target.exists(): target.mkdir()
from datacube.utils.cog import write_cog

def write_band(ds, varname):
    """Write each time layer of the named variable in the xarray dataset to a GeoTIFF file"""
    for i in range(len(ds.time)):
        date = ds[varname].isel(time=i).time.dt.strftime('%Y%m%d').data
        single = ds[varname].isel(time=i).compute()
        write_cog(geo_im=single, fname=f'{target}/example_sentinel-1_{varname}_{date}.tif', overwrite=True)
write_band(data, 'vv')
write_band(data, 'vh')
# write_band(rgb_da, 'rgb')